Tree-Based Contrast Subspace Mining for Categorical Data
Similar resources
Rough subspace-based clustering ensemble for categorical data
Clustering categorical data, an important problem in data mining, has recently attracted much attention. In this paper, the problem of unsupervised dimensionality reduction for categorical data is first studied. Based on the theory of rough sets, the attributes of categorical data are decomposed into a number of rough subspaces. A novel clustering ensemble algorithm based on rough subs...
Subspace Clustering for High Dimensional Categorical Data
A fundamental operation in data mining is to partition a given dataset into clusters such that objects in the same cluster are more similar to each other than objects in different clusters according to some defined criteria [2]. These criteria are usually defined in the form of some distance, and similarity is hence defined as follows: the smaller the distance, the more similar the objects a...
CPCQ: Contrast pattern based clustering quality index for categorical data
Clustering validation is concerned with assessing the quality of clustering solutions. Since clustering is unsupervised and highly explorative, clustering validation has been an important and long standing research problem. Existing validity measures, including entropy-based and distance-based indices, have significant shortcomings. Indeed, for many datasets from the UCI repository, they fail t...
A New Algorithm for Optimization of Fuzzy Decision Tree in Data Mining
Decision-tree algorithms provide one of the most popular methodologies for symbolic knowledge acquisition. The resulting knowledge, a symbolic decision tree along with a simple inference mechanism, has been praised for comprehensibility. The most comprehensible decision trees have been designed for perfect symbolic data. Classical crisp decision trees (DT) are widely applied to classification t...
Data Mining and Tree-Based Optimization
Consider a large collection of objects, each of which has a large number of attributes of several different sorts. We assume that there are data attributes representing data, attributes which are to be statistically estimated from these, and attributes which can be controlled or set. A motivating example is to assign a credit score to a credit card prospect indicating the likelihood that the pr...
Journal
Journal title: International Journal of Computational Intelligence Systems
Year: 2020
ISSN: 1875-6883
DOI: 10.2991/ijcis.d.201020.001